Is Sentence Compression an NLG task?

Authors

  • Erwin Marsi
  • Emiel Krahmer
  • Iris Hendrickx
  • Walter Daelemans
Abstract

Data-driven approaches to sentence compression define the task as dropping any subset of words from the input sentence while retaining important information and grammaticality. We show that only 16% of the observed compressed sentences in the domain of subtitling can be accounted for in this way. We argue that part of this is due to evaluation issues and estimate that a deletion model is in fact compatible with approximately 55% of the observed data. We analyse the remaining problems and conclude that in those cases word order changes and paraphrasing are crucial, and argue for more elaborate sentence compression models which build on NLG work.
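The deletion-only definition of compression described in the abstract can be illustrated with a small check: a compressed sentence is compatible with a deletion model only if its words form an in-order subsequence of the input sentence. The sketch below is a minimal illustration, not the authors' procedure; the function name `is_deletion_compression`, the whitespace tokenization, and the lowercasing are all assumptions made for the example.

```python
def is_deletion_compression(source: str, compressed: str) -> bool:
    """Return True if `compressed` can be produced from `source`
    by dropping words only (no reordering, no paraphrasing).

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; a real system would use a proper tokenizer.
    """
    remaining = iter(source.lower().split())
    # `token in remaining` advances the iterator until it finds the token,
    # so every compressed token must appear in the source in the same order.
    return all(token in remaining for token in compressed.lower().split())


source = "the quick brown fox jumps over the lazy dog"

# Pure word deletion: compatible with a deletion model.
print(is_deletion_compression(source, "the fox jumps over the dog"))

# Word order change: not reachable by deletion alone.
print(is_deletion_compression(source, "the dog jumps over the fox"))

# Paraphrase with words absent from the source: not reachable either.
print(is_deletion_compression(source, "a fox leaps over a dog"))
```

Sentence pairs that fail this check are the ones that need word order changes or paraphrasing, which is the kind of case the abstract argues calls for NLG-style models.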


Similar resources

Domain proposal: Sentence generation with tree-adjoining grammars (CRISP)

Overview: The CRISP domain is motivated by the problem of sentence generation in natural language generation (NLG). NLG is a major subfield of natural language processing, concerned with computing natural language sentences or texts that convey a given piece of information to an audience. In the sentence generation task, we focus on generating a single sentence that expresses a given meaning ac...


Génération de phrases multilingues par apprentissage automatique de modèles de phrases. (Multilingual Natural Language Generation using sentence models learned from corpora)

Natural Language Generation (NLG) is the natural language processing task of generating natural language from a machine representation system. In this thesis report, we present an architecture for an NLG system relying on statistical methods. The originality of our proposition is its ability to use a corpus as a lea...


Experiments in Linear Template Combination using Genetic Algorithms

Natural Language Generation systems typically have two parts: strategic ("what to say") and tactical ("how to say"). We present our experiments in building an unsupervised corpus-driven template-based tactical NLG system. We consider templates as a sequence of words containing gaps. Our idea is based on the observation that templates are grammatical locally (within their textual span). We ...


Modality in Dialogue: Planning, Pragmatics and Computation

Natural language generation (NLG) is first and foremost a reasoning task. In this reasoning, a system plans a communicative act that will signal key facts about the domain to the hearer. In generating action descriptions, this reasoning draws on characterizations both of the causal properties of the domain and the states of knowledge of the participants in the conversation. This dissertation sh...


Semantic role labeling of Persian sentences with a memory-based learning approach

Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a semantic role labeling system for Persian, using a memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...




Publication date: 2009